Abstract

We introduce the dispersion models with a regression structure to extend the generalized linear models, the exponential family nonlinear models (Cordeiro and Paula, 1989) and the proper dispersion models (Jørgensen, 1997a). We provide a matrix expression for the skewness of the maximum likelihood estimators of the regression parameters in dispersion models. The formula is suitable for computer implementation and can be applied to several important submodels discussed in the literature. Expressions for the skewness of the maximum likelihood estimators of the precision and dispersion parameters are also derived. In particular, our results extend previous formulas obtained by Cordeiro and Cordeiro (2001) and Cavalcanti et al. (2009). A simulation study is performed to show the practical importance of our results.

1 Introduction

The assumption of symmetry plays a crucial role in many statistical procedures. The notion of skewness of a distribution is related to a symmetry property. The most commonly used measure of skewness is the standardized third cumulant defined by $\gamma_1 = \kappa_3/\kappa_2^{3/2}$, where $\kappa_r$ is the $r$th cumulant of the distribution. In fact, the classical tests of symmetry use the standardized third sample cumulant. A departure from the normal value of zero then indicates skewness. Intuitively, we think of a distribution as being skewed if it systematically deviates from symmetry by leaning to one side. Clearly, if the distribution is symmetrical, $\gamma_1$ vanishes and therefore its value will give some indication of the extent of departure from symmetry. However, there are asymmetrical distributions with as many zero odd-order central moments as desired, so the value of $\gamma_1$ must be interpreted with some caution. When $\gamma_1 > 0$ ($\gamma_1 < 0$), the distribution is positively (negatively) skewed and will have a longer (shorter) right tail and a shorter (longer) left tail.
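
As a small illustration of the standardized third cumulant, the following Python sketch evaluates $\gamma_1 = \kappa_3/\kappa_2^{3/2}$ from given cumulants. The function name `standardized_skewness` and the three example distributions are choices made here for illustration only; they are not part of the paper.

```python
def standardized_skewness(kappa2: float, kappa3: float) -> float:
    """Return gamma_1 = kappa_3 / kappa_2**1.5 from the second and third cumulants."""
    if kappa2 <= 0:
        raise ValueError("the second cumulant (variance) must be positive")
    return kappa3 / kappa2 ** 1.5


# Poisson(4): every cumulant equals 4, so gamma_1 = 4 / 4**1.5 (right-skewed).
poisson_skew = standardized_skewness(4.0, 4.0)
# Normal: the third cumulant is zero, so gamma_1 = 0.
normal_skew = standardized_skewness(2.5, 0.0)
# Exponential with unit rate: kappa_2 = 1 and kappa_3 = 2.
exponential_skew = standardized_skewness(1.0, 2.0)
```

A positive value signals a longer right tail, in line with the interpretation given in the text.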

The value of the index $\gamma_1$ has been suggested as a possible measure of non-normality of the distribution. We are concerned with the asymptotic skewness of the distribution of the maximum likelihood estimators (MLEs) in the class of dispersion models (DMs) (Jørgensen, 1997b). This class of models represents a collection of probability density functions that contains as sub-models the proper dispersion models (PDMs) (Jørgensen, 1997a) and the well-known one-parameter exponential families.

We assume that the random variables $Y_1, \ldots, Y_n$ are independent and each $Y_i$ has a probability density function (pdf) of the form

$$\pi(y; \mu_i, \phi) = \exp\{\phi\, t(y, \mu_i) + a(\phi, y)\}, \qquad y \in \mathbb{R}, \tag{1}$$

where $t(\cdot,\cdot)$ and $a(\cdot,\cdot)$ are known functions, $\phi > 0$ and $\mu_i$ varies in an interval of the real line. If $Y$ is continuous, $\pi$ is assumed to be a density with respect to Lebesgue measure, while if $Y$ is discrete, $\pi$ is assumed to be a density with respect to counting measure. We call $\phi$ the precision parameter and $\sigma^2 = \phi^{-1}$ the dispersion parameter. Similarly, the parameter $\mu$ may generally be interpreted as a kind of location parameter, not necessarily the mean of the distribution. In practice, certain simplifications may be desirable. Exponential dispersion models (EDMs) represent a special case of DMs for $t(y,\mu) = \theta y - b(\theta)$, where $\mu = b'(\theta)$. The PDMs are also a special case of (1) for $a(\phi, y) = d_1(\phi) + d_2(y)$, where $d_1(\cdot)$ and $d_2(\cdot)$ are known functions.

We introduce a regression structure to (1)

$$h(\mu_i) = \eta_i = f(x_i; \beta), \tag{2}$$

where $x_i = (x_{i1}, \ldots, x_{im})^T$ is an $m$-vector of non-stochastic independent variables associated with the $i$th response, $\beta = (\beta_1, \ldots, \beta_p)^T$ is a $p$-vector of unknown parameters, $h(\cdot)$ is a known one-to-one twice continuously differentiable function, usually referred to as the link function, and $f(\cdot;\cdot)$ is a possibly nonlinear, twice continuously differentiable function with respect to $\beta$. The regression structure relates the covariates $x_i$ to the parameter of interest $\mu_i$. The $n \times p$ matrix of derivatives of $\eta$ with respect to $\beta$, specified by $\tilde{X} = \tilde{X}(\beta) = \partial\eta/\partial\beta$, is assumed to have rank $p$ for all $\beta$. The DM defined by equations (1) and (2) is a general model that allows for parsimonious representation. We assume that the usual regularity conditions for maximum likelihood estimation and large sample inference hold; see Cox and Hinkley (1974, Chapter 9).

From now on, the term "dispersion model" (denoted simply by DM) represents a regression model specified by (1) and (2). For DMs, Rocha et al. (2009) obtained a matrix expression for the covariance matrix of the MLEs up to order $O(n^{-2})$, where $n$ is the sample size, Simas et al. (2009a) calculated the second-order biases of the estimators of the parameters and Simas et al. (2009b) studied asymptotic tail properties for some distributions belonging to the class of dispersion models.

The DMs extend the exponential family nonlinear models (EFNLMs) (Cordeiro and Paula, 1989), since they contain many distributions that are not in the exponential family form, whereas the EFNLMs generalize the well-known generalized linear models (GLMs), since they allow a nonlinear regression structure. Paula (1992) derived general expressions for the second-order biases of the MLEs in EFNLMs, thus extending previous results by Cordeiro and McCullagh (1991) for GLMs. Wei (2004) wrote an excellent book on these models. More recently, Simas and Cordeiro (2009) proposed corrected Pearson residuals in EFNLMs and Simas et al. (2009a) proposed corrected MLEs in DMs, thus extending the results by Cordeiro and McCullagh (1991) and Paula (1992).

The PDMs contain several important non-exponential models, for instance, the von Mises regression model for data distributed along the unit circle and the simplex model for data distributed in the standard unit interval $(0, 1)$. A complete study of PDMs is presented by Jørgensen (1997b).

Few attempts have been made to develop second-order asymptotic theory for DMs in order to have better likelihood inference procedures. An asymptotic formula of order $n^{-1/2}$ for the skewness of the distribution of $\hat\beta$ in GLMs was derived by Cordeiro and Cordeiro (2001). In this article, we provide asymptotic formulae for the third cumulants of the distributions of the MLEs of the regression parameters $\beta$, precision parameter $\phi$ and dispersion parameter $\sigma^2$ in DMs, thus extending the results by Cordeiro and Cordeiro (2001). The formulae are useful to define the skewness of these distributions corrected to order $n^{-1/2}$. The knowledge of the skewness can be used as a measure of departure of these distributions from normality. We consider asymptotic results for likelihood inference with respect to the vector $\beta$ of parameters and the scalars $\phi$ and $\sigma^2$ for large $n$.

The rest of the paper is organized as follows. In Section 2, we apply the general formula for the third cumulant of the MLE given by Bowman and Shenton (1998) to obtain a simple expression for the skewness of the distribution of the MLE $\hat\beta$. Section 3 is devoted to the skewness of the distributions of the MLEs $\hat\phi$ and $\hat\sigma^2$. In Section 4, we apply our main result to a number of important special models. In Section 5, we provide simulation results for the reciprocal gamma nonlinear model to investigate the skewness of the MLEs in DMs and to motivate the use of the proposed formula. Some concluding remarks are given in Section 6.

2 Skewness of $\hat\beta$

In this section, we derive the skewness of the MLEs of the parameters $\beta$ in DMs. Consider the observations $y_1, \ldots, y_n$ and let $\ell = \ell(\beta, \phi)$ be the total log-likelihood function for $\beta$ and $\phi$. We assume that the usual regularity conditions for maximum likelihood estimation and large sample inference hold (Cox and Hinkley, 1974, Chapter 9). A simple calculation shows that $E(\partial^2\ell/\partial\phi\,\partial\beta) = 0$, and then the parameters $\beta$ and $\phi$ are globally orthogonal (Cox and Reid, 1987). Let $\hat\beta$ and $\hat\phi$ be the MLEs of $\beta$ and $\phi$, respectively, and $\mu_i = h^{-1}(\eta_i)$ be the inverse link function. Then, the deviance for the DM, given the data vector $y$, is defined by

$$D(y, \mu) = 2\sum_{i=1}^{n}\Big[\sup_{\mu} t(y_i, \mu) - t(y_i, \mu_i)\Big].$$

The MLE of $\beta$ can be calculated by minimizing the deviance $D(y, \mu)$ with respect to $\beta$. The maximum likelihood equations for $\beta$ do not depend on the precision parameter $\phi$ and are given by $\tilde{X}^T t'(y, \mu) = 0$, where $t'(y, \mu) = \partial t(y, \mu)/\partial\mu$ is an $n \times 1$ vector. These nonlinear equations have the same form as the standard estimating equations for GLMs and can be solved by iterative methods. Alternatively, we can directly maximize minus the deviance, $-D(y, \mu)$, for example, using standard statistical software such as SAS or the GAMLSS package in R.

Given the estimate $\hat\beta$, the MLE of $\phi$ is obtained as the solution of the nonlinear equation

$$\sum_{i=1}^{n} a'(\hat\phi, y_i) = \frac{1}{2} D(y, \hat\mu) - \sum_{i=1}^{n} \sup_{\mu} t(y_i, \mu),$$

where $a'(\phi, y) = \partial a(\phi, y)/\partial\phi$. The MLE of the precision parameter is a function of the deviance of the model. The MLE of the dispersion parameter $\sigma^2$ is $\hat\sigma^2 = \hat\phi^{-1}$.

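We define $d_r = d_r(\mu, \phi) = E\{\partial^r t(Y, \mu)/\partial\mu^r\}$ for $r = 1, 2, 3$. From some regularity conditions, we have $d_1 = 0$ and $d_2 = -\phi E\{[\partial t(Y, \mu)/\partial\mu]^2\}$. We shall use the following notation for the derivatives of the log-likelihood function $\ell = \ell(\beta, \phi)$: $\kappa_{rs} = E(\partial^2\ell/\partial\beta_r\partial\beta_s)$, $\kappa_{rst} = E(\partial^3\ell/\partial\beta_r\partial\beta_s\partial\beta_t)$, $\kappa_{r,s} = E(\partial\ell/\partial\beta_r\,\partial\ell/\partial\beta_s)$, $\kappa_{r,s,t} = E(\partial\ell/\partial\beta_r\,\partial\ell/\partial\beta_s\,\partial\ell/\partial\beta_t)$, $\kappa_{r,st} = E(\partial\ell/\partial\beta_r\,\partial^2\ell/\partial\beta_s\partial\beta_t)$, etc. Note that $\kappa_{r,s} = -\kappa_{rs}$ and that $\kappa_{rs,t}$ is the covariance of the first derivative of $\ell$ with respect to $\beta_t$ with the mixed second derivative with respect to $\beta_r$ and $\beta_s$. All $\kappa$'s refer to a total over the sample and are, in general, of order $n$. The total Fisher information matrix has elements $\kappa_{r,s} = -\kappa_{rs}$ and we let $\kappa^{r,s}$ be the corresponding elements of its inverse. The joint information matrix for $\gamma = (\beta^T, \phi)^T$ is $K_\gamma = \mathrm{diag}\{\phi \tilde{X}^T W \tilde{X},\, n a^{(2)}\}$, where $W = \mathrm{diag}\{-d_2 (d\mu/d\eta)^2\}$ and $a^{(2)} = a^{(2)}(\mu, \phi) = -E\{\partial^2 a(\phi, Y)/\partial\phi^2\}$. The MLEs of $\beta$ and $\phi$ are asymptotically independent due to their asymptotic normality and the block diagonal structure of the joint information matrix $K_\gamma$.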

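We introduce the notation $(r)_i = \partial\eta_i/\partial\beta_r$, $(r,s)_i = (\partial\eta_i/\partial\beta_r)(\partial\eta_i/\partial\beta_s)$, $(r,st)_i = (\partial\eta_i/\partial\beta_r)(\partial^2\eta_i/\partial\beta_s\partial\beta_t)$, etc. Let $\kappa_3(\hat\beta_a) = E\{(\hat\beta_a - \beta_a)^3\}$ be the third cumulant of the MLE $\hat\beta_a$ of $\beta_a$ for $a = 1, \ldots, p$. From the general expression for the multi-parameter third cumulants of the MLEs given by Bowman and Shenton (1998), we can write to order $n^{-2}$

$$\kappa_3(\hat\beta_a) = {\sum}'\, \kappa^{a,r}\kappa^{a,s}\kappa^{a,t}\,(\kappa_{r,s,t} + 3\kappa_{rst} + 6\kappa_{rs,t}). \tag{3}$$

In equation (3), $\sum'$ denotes the summation over all $p + 1$ parameters $\beta_1, \ldots, \beta_p$ and $\phi$. Let $\sum$ be the summation over the observations. The key for obtaining a simple expression for $\kappa_3(\hat\beta_a)$ in DMs is the invariance of the $\kappa$'s under permutation of the parameters $\beta$ and the orthogonality between $\phi$ and $\beta$ (Cox and Reid, 1987), i.e., $E(-\partial^2\ell/\partial\beta\,\partial\phi) = 0$. After some calculation and using the notation of Cordeiro et al. (1994), we obtain

$$\kappa_{rst} = -\phi\sum_{i=1}^{n}\big[(f + 2g)_i (r,s,t)_i + w_i\{(r,st)_i + (s,rt)_i + (t,rs)_i\}\big],$$

$$\kappa_{rs,t} = \phi\sum_{i=1}^{n}\big[(f - e)_i (r,s,t)_i + w_i (t,rs)_i\big] \quad\text{and}\quad \kappa_{r,s,t} = -\phi\sum_{i=1}^{n}(2f - 2g - 3e)_i (r,s,t)_i,$$

where

$$w = -\Big(\frac{d\mu}{d\eta}\Big)^2 d_2, \qquad g = -\frac{d\mu}{d\eta}\frac{d^2\mu}{d\eta^2}\, d_2, \qquad e = -\Big(\frac{d\mu}{d\eta}\Big)^3 d_2' \quad\text{and}\quad f = -\frac{d\mu}{d\eta}\frac{d^2\mu}{d\eta^2}\, d_2 - \Big(\frac{d\mu}{d\eta}\Big)^3 d_3,$$

where $d_2'$ is the first partial derivative of $d_2$ with respect to $\mu$. Because of the orthogonality between $\phi$ and $\beta$, we have only to take into account in equation (3) the sum of terms involving the various combinations of the parameters $\beta$. Hence, the crucial quantity $\kappa_{r,s,t} + 3\kappa_{rst} + 6\kappa_{rs,t}$ for the third central moment of $\hat\beta_a$ is given by

$$\kappa_{r,s,t} + 3\kappa_{rst} + 6\kappa_{rs,t} = \phi\sum_{i=1}^{n}\big[(f - 4g - 3e)_i (r,s,t)_i + 3w_i\{(t,rs)_i - (r,st)_i - (s,rt)_i\}\big]. \tag{4}$$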

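Inserting (4) in (3), inverting the order of the summation and rearranging, we obtain

$$\kappa_3(\hat\beta_a) = \phi\sum_{i=1}^{n}(f - 4g - 3e)_i\Big(\sum_{r=1}^{p}\kappa^{a,r}(r)_i\Big)^3 - 3\phi\sum_{i=1}^{n} w_i\Big(\sum_{r=1}^{p}\kappa^{a,r}(r)_i\Big)\Big(\sum_{s,t=1}^{p}\kappa^{a,s}\kappa^{a,t}(st)_i\Big).$$

Let $\phi K_\beta$ be the information matrix for $\beta$, where $K_\beta = \tilde{X}^T W \tilde{X}$. Also, let $\rho_a$ and $\delta_i$ be $1 \times p$ and $n \times 1$ vectors of zeros with one in the $a$th and $i$th components, respectively. Thus, $\sum_{r=1}^{p}\kappa^{a,r}(r)_i = \phi^{-1}\rho_a K_\beta^{-1}\tilde{X}^T\delta_i$. Further, let $\tilde{X}_i$ be a $p \times p$ matrix with elements $\partial^2\eta_i/\partial\beta_r\partial\beta_s$. Then, $\sum_{s,t=1}^{p}\kappa^{a,s}\kappa^{a,t}(st)_i = \phi^{-2}\rho_a K_\beta^{-1}\tilde{X}_i K_\beta^{-1}\rho_a^T$. We define the matrices of order $p \times n$: $M = \{m_{ai}\} = K_\beta^{-1}\tilde{X}^T$ and $N = \{n_{ai}\} = \{\rho_a K_\beta^{-1}\tilde{X}_i K_\beta^{-1}\rho_a^T\}$. The $O(n^{-2})$ third cumulant of $\hat\beta_a$ is

$$\kappa_3(\hat\beta_a) = \phi^{-2}\sum_{i=1}^{n}(f - 4g - 3e)_i\, m_{ai}^3 - 3\phi^{-2}\sum_{i=1}^{n} w_i\, m_{ai}\, n_{ai},$$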



where $m_{ai}$ is the $(a, i)$th element of the matrix $M$. Let $\kappa_3(\hat\beta) = (\kappa_3(\hat\beta_1), \ldots, \kappa_3(\hat\beta_p))^T$ be the $p \times 1$ vector of the third cumulants of the $\hat\beta$'s. The third cumulant vector has a simple expression

$$\kappa_3(\hat\beta) = \phi^{-2}\{M^{(3)}(f - 4g - 3e) - 3(M \odot N)w\}, \tag{5}$$

where $f = (f_1, \ldots, f_n)^T$, $g = (g_1, \ldots, g_n)^T$, $e = (e_1, \ldots, e_n)^T$ and $w = (w_1, \ldots, w_n)^T$ are $n \times 1$ vectors, whose elements were previously defined, $M^{(3)} = M \odot M \odot M$, and $\odot$ is the Hadamard (direct) product. Expression (5) is a function of the model matrix $\tilde{X}$, the matrices $\tilde{X}_i$ for $i = 1, \ldots, n$, the first three derivatives of the function $t(\cdot,\cdot)$ with respect to $\mu$ and the unknown $\mu$'s. The third cumulant vector is easily computed since it involves only simple operations on matrices and vectors. The vector $\kappa_3(\hat\beta)$ is weighted by the inverse of the square of the precision parameter. Equation (5) generalizes previous results obtained by Cordeiro and Cordeiro (2001) and Cavalcanti et al. (2009) for GLMs and EFNLMs, respectively.
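
The matrix formula (5) can be sketched in a few lines of code. The Python fragment below is a minimal illustration using only the standard library, not the authors' implementation: the function names `solve_inverse` and `third_cumulants`, the argument layout, and the convention of passing the combined vector $c = f - 4g - 3e$ are all assumptions made here. The example is an intercept-only Poisson model with log link, for which $w_i = \mu$ and $c_i = -2\mu$ follow from $d_2 = -V^{-1}$ and $d_3 = 2V^{-2}V^{(1)}$; the result can be checked against the delta-method value $-2/(n^2\mu^2)$ for the third cumulant of $\log\bar{Y}$.

```python
def solve_inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def third_cumulants(x_tilde, w, c, phi, hessians=None):
    """Evaluate phi^-2 {M^(3) c - 3 (M o N) w}, where c = f - 4g - 3e.

    x_tilde  : n x p list of rows (derivatives of eta with respect to beta)
    w, c     : length-n lists
    hessians : optional list of n p x p matrices of second derivatives of eta_i;
               None means a linear predictor, for which N vanishes
    """
    n, p = len(x_tilde), len(x_tilde[0])
    # K_beta = X~^T W X~
    k = [[sum(x_tilde[i][r] * w[i] * x_tilde[i][s] for i in range(n))
          for s in range(p)] for r in range(p)]
    k_inv = solve_inverse(k)
    # M = K_beta^{-1} X~^T, a p x n matrix
    m = [[sum(k_inv[a][r] * x_tilde[i][r] for r in range(p)) for i in range(n)]
         for a in range(p)]
    out = []
    for a in range(p):
        total = sum(m[a][i] ** 3 * c[i] for i in range(n))
        if hessians is not None:
            for i in range(n):
                # n_ai = rho_a K^{-1} X~_i K^{-1} rho_a^T
                n_ai = sum(k_inv[a][s] * hessians[i][s][t] * k_inv[t][a]
                           for s in range(p) for t in range(p))
                total -= 3.0 * w[i] * m[a][i] * n_ai
        out.append(total / phi ** 2)
    return out


# Intercept-only Poisson model with log link: w_i = mu and c_i = -2 mu.
n_obs, mu = 10, 2.0
kappa3 = third_cumulants([[1.0]] * n_obs, [mu] * n_obs, [-2.0 * mu] * n_obs, phi=1.0)

# Toy nonlinear case (hypothetical numbers) to exercise the N term.
kappa3_nl = third_cumulants([[1.0], [1.0]], [1.0, 1.0], [0.0, 0.0], phi=1.0,
                            hessians=[[[2.0]], [[2.0]]])
```

For larger problems a numerical linear algebra library would replace the hand-written inverse; the pure-Python version is used here only to keep the sketch self-contained.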

From the third cumulant vector (5) and the asymptotic covariance matrix $\mathrm{Cov}(\hat\beta) = \phi^{-1}(\tilde{X}^T W \tilde{X})^{-1}$ of $\hat\beta$, we can easily obtain the asymptotic skewness $\gamma_1(\hat\beta_a) = \kappa_3(\hat\beta_a)/\mathrm{Var}(\hat\beta_a)^{3/2}$ of the distribution of the estimate $\hat\beta_a$ of the regression parameter $\beta_a$ for $a = 1, \ldots, p$. Clearly, $\gamma_1(\hat\beta_a)$ is of order $n^{-1/2}$ and is weighted by the inverse of the square root of the precision parameter $\phi$. Thus, the normal approximation for the distribution of $\hat\beta$ deteriorates when $\phi$ decreases, which is consistent with the small dispersion asymptotics phenomenon noted by Jørgensen (1987b). The parameters $\phi$ and $\mu$ should be replaced by consistent estimators $\hat\phi$ and $\hat\mu$ to obtain a numerical value for $\gamma_1(\hat\beta_a)$. We can use the estimate of the skewness $\gamma_1(\hat\beta_a)$ as an indicator of departure from the normal distribution of $\hat\beta_a$.

By evaluating the skewness from (5), we can obtain an approximate Edgeworth expansion for the density function of the standardized estimate $\hat\beta_a$, whose leading terms are

$$f(x) = \phi(x)\Big\{1 + \frac{\gamma_1(\hat\beta_a)}{6} H_3(x) + \frac{\gamma_1(\hat\beta_a)^2}{72} H_6(x)\Big\},$$

where $\phi(x)$ is the standard normal density function and $H_3(x) = x^3 - 3x$ and $H_6(x) = x^6 - 15x^4 + 45x^2 - 15$ are Hermite polynomials, which should work better than the standard normal distribution.
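
The Edgeworth correction can be explored numerically with the sketch below. It assumes the usual skewness-only Edgeworth form for a standardized variable, $\phi(x)\{1 + \gamma_1 H_3(x)/6 + \gamma_1^2 H_6(x)/72\}$, built from the two Hermite polynomials named in the text; the function names and the value 0.6 used for $\gamma_1$ are illustrative assumptions.

```python
import math


def hermite3(x: float) -> float:
    """Hermite polynomial H_3(x) = x^3 - 3x."""
    return x ** 3 - 3.0 * x


def hermite6(x: float) -> float:
    """Hermite polynomial H_6(x) = x^6 - 15x^4 + 45x^2 - 15."""
    return x ** 6 - 15.0 * x ** 4 + 45.0 * x ** 2 - 15.0


def normal_density(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def edgeworth_density(x: float, gamma1: float) -> float:
    """Skewness-corrected density approximation for a standardized estimator."""
    correction = 1.0 + gamma1 * hermite3(x) / 6.0 + gamma1 ** 2 * hermite6(x) / 72.0
    return normal_density(x) * correction


# The correction terms integrate to zero, so the approximation keeps unit mass.
step = 0.01
grid = [-10.0 + step * j for j in range(2001)]
values = [edgeworth_density(x, 0.6) for x in grid]
total_mass = step * (sum(values) - 0.5 * (values[0] + values[-1]))
```

With $\gamma_1 = 0$ the function reduces to the standard normal density, so the size of $\gamma_1$ directly controls how far the approximation departs from normality.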



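3 Skewness of $\hat\phi$ and $\hat\sigma^2$

We provide general formulae for the $n^{-2}$ third cumulants of the MLEs of the precision and dispersion parameters in DMs. First, we consider the third cumulant of the estimate $\hat\phi$ derived in Section 2 as a solution of a nonlinear equation. Let $a^{(r)} = E\{(\partial a(\phi, Y)/\partial\phi)^r\}$ and $a_{r,s} = E\{\partial^r a(\phi, Y)/\partial\phi^r\; \partial^s a(\phi, Y)/\partial\phi^s\}$. From the orthogonality between $\phi$ and $\beta$, equation (3) yields

$$\kappa_3(\hat\phi) = (\kappa^{\phi,\phi})^3(\kappa_{\phi,\phi,\phi} + 3\kappa_{\phi\phi\phi} + 6\kappa_{\phi\phi,\phi}).$$

Let $\kappa^{\phi,\phi} = \{a^{(2)}\}^{-1}$, $\kappa_{\phi\phi,\phi} = a_{2,1}$, $\kappa_{\phi\phi\phi} = a_{3,0}$ and $\kappa_{\phi,\phi,\phi} = a^{(3)}$. Thus, the third cumulant of $\hat\phi$ becomes

$$\kappa_3(\hat\phi) = \frac{a^{(3)} + 3a_{3,0} + 6a_{2,1}}{\{a^{(2)}\}^3}. \tag{6}$$

From equation (6) and the asymptotic variance $\mathrm{Var}(\hat\phi) = \{a^{(2)}\}^{-1}$, we obtain the asymptotic skewness of $\hat\phi$ as

$$\gamma_1(\hat\phi) = \frac{a^{(3)} + 3a_{3,0} + 6a_{2,1}}{\{a^{(2)}\}^{3/2}}.$$

We write (1) in terms of $\sigma^2 = \phi^{-1}$

$$\pi(y; \mu_i, \sigma^2) = \exp\{\sigma^{-2} t(y, \mu_i) + a_*(\sigma^2, y)\}, \qquad y \in \mathbb{R}, \tag{7}$$

where $a_*(\sigma^2, y) = a(\sigma^{-2}, y)$. A straightforward calculation shows that $\sigma^2$ and $\mu$ are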
orthogonal parameters. Let a^''^ = E{d^a^{a'^ ,Y) / d{a'^y} / d{a'^y . We can obtain 
the cumulants k„2^„2 = -2a ^a;°'^ - a;°'^, k^2^2^2 = -6a ^a;°'^ + k^2^2^^2 = al'"^ +
2a~^ay + Aa-^a^/ - a^'^ and ^^2,^2,^2 = -6a~^ay - Sa^'^ - 6a-^a°'^ + 2aP. From 

211 12 03 

Ko-2 0-2 0-2 + 3fi:o-2o-2o-2 + 6ko-2o-2 0-2 = —6a a^' + 3a^' — a^' , 



we have 



-2 0,1 I 0,2\3 

[2a ^a* + a* j 

From equation (8), the asymptotic skewness of the distribution of $\hat\sigma^2$ is given by



71 (a 



a. 



0,3 



o -2 0,1 

-2a ^a^ 



0,2\ 

- a* j 



0,2\3/2 



4 Some special models

Here, we examine some special cases of formulas (5), (6) and (8). Some other special cases could be easily derived because of the advantage of the explicit matrix expression, which is easily implemented in statistical packages or in a computer algebra system such as Mathematica or Maple. Table 1 lists the most common link functions and the quantities required for the skewness of the MLE $\hat\beta$, where $\Phi(\cdot)$ is the standard normal distribution function, $\phi(x)$ is the density of the standard normal distribution and $\phi'(x)$ is its first derivative.



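Table 1: The most common link functions and their derivatives.

Link                Formula                        $d\mu/d\eta$                 $d^2\mu/d\eta^2$
Logit               $\log(\mu/(1-\mu)) = \eta$     $\mu(1-\mu)$                 $\mu(1-\mu)(1-2\mu)$
Probit              $\Phi^{-1}(\mu) = \eta$        $\phi(\Phi^{-1}(\mu))$       $\phi'(\Phi^{-1}(\mu))$
Log                 $\log(\mu) = \eta$             $\mu$                        $\mu$
Identity            $\mu = \eta$                   $1$                          $0$
Reciprocal          $\mu^{-1} = \eta$              $-\mu^2$                     $2\mu^3$
Square reciprocal   $\mu^{-2} = \eta$              $-\mu^3/2$                   $3\mu^5/4$
Square root         $\sqrt{\mu} = \eta$            $2\sqrt{\mu}$                $2$
C-loglog            $\log(-\log(1-\mu)) = \eta$    $-\log(1-\mu)(1-\mu)$        $-(1-\mu)\log(1-\mu)\,(1+\log(1-\mu))$
Tangent             $\tan(\mu) = \eta$             $\cos(\mu)^2$                $-2\cos(\mu)^3\sin(\mu)$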
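
The link-function derivatives are easy to get wrong by a sign, so a numerical check can be useful. The sketch below compares closed-form expressions for $d\mu/d\eta$ and $d^2\mu/d\eta^2$ (written as functions of $\mu$) with central finite differences of the inverse link. The dictionary `LINKS`, the helper `max_abs_error` and the evaluation point $\eta = 0.8$ are assumptions of this illustration; the tangent entry uses the sign obtained by differentiating $\mu = \arctan\eta$ twice, namely $-2\cos(\mu)^3\sin(\mu)$.

```python
import math

# name -> (inverse link mu(eta), d mu/d eta as a function of mu, d^2 mu/d eta^2 as a function of mu)
LINKS = {
    "logit": (lambda eta: 1.0 / (1.0 + math.exp(-eta)),
              lambda mu: mu * (1.0 - mu),
              lambda mu: mu * (1.0 - mu) * (1.0 - 2.0 * mu)),
    "log": (math.exp, lambda mu: mu, lambda mu: mu),
    "reciprocal": (lambda eta: 1.0 / eta,
                   lambda mu: -mu ** 2,
                   lambda mu: 2.0 * mu ** 3),
    "square_reciprocal": (lambda eta: eta ** -0.5,
                          lambda mu: -mu ** 3 / 2.0,
                          lambda mu: 3.0 * mu ** 5 / 4.0),
    "square_root": (lambda eta: eta ** 2,
                    lambda mu: 2.0 * math.sqrt(mu),
                    lambda mu: 2.0),
    "cloglog": (lambda eta: 1.0 - math.exp(-math.exp(eta)),
                lambda mu: -math.log(1.0 - mu) * (1.0 - mu),
                lambda mu: -(1.0 - mu) * math.log(1.0 - mu) * (1.0 + math.log(1.0 - mu))),
    "tangent": (math.atan,
                lambda mu: math.cos(mu) ** 2,
                lambda mu: -2.0 * math.cos(mu) ** 3 * math.sin(mu)),
}


def max_abs_error(name: str, eta: float, h: float = 1e-4) -> float:
    """Largest gap between the closed forms and central finite differences at eta."""
    inverse, first, second = LINKS[name]
    mu = inverse(eta)
    numeric_first = (inverse(eta + h) - inverse(eta - h)) / (2.0 * h)
    numeric_second = (inverse(eta + h) - 2.0 * mu + inverse(eta - h)) / h ** 2
    return max(abs(numeric_first - first(mu)), abs(numeric_second - second(mu)))


errors = {name: max_abs_error(name, 0.8) for name in LINKS}
```

All entries of `errors` should be at the level of finite-difference noise; a large value would flag a mistyped derivative.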



4.1 Generalized Linear Models

We calculate the skewness of the MLE $\hat\beta$. The function $t(\cdot,\cdot)$ has the form $t(y, \theta) = y\theta - b(\theta)$, where the mean value is $\mu = \tau(\theta) = b'(\theta)$ and the variance function $V = V(\mu)$ is related to the mean by $d\tau^{-1}(\mu)/d\mu = V^{-1}$. We have $t\{y, \tau^{-1}(\mu)\} = y\tau^{-1}(\mu) - b\{\tau^{-1}(\mu)\}$. For GLMs, $d_2 = -V^{-1}$ and $d_3 = 2V^{-2}V^{(1)}$, where $V^{(1)} = dV(\mu)/d\mu$, $W = \mathrm{diag}\{V^{-1}(d\mu/d\eta)^2\}$, $\tilde{X}$ reduces to the model matrix $X$, $h(\mu_i) = \eta_i = x_i^T\beta$ and $N$ vanishes. From the matrix $M = \{m_{ai}\} = (X^T W X)^{-1}X^T$ and by formula (5), we obtain

$$\kappa_3(\hat\beta_a) = \phi^{-2}\sum_{i=1}^{n} m_{ai}^3\Big\{V^{-2}V^{(1)}\Big(\frac{d\mu}{d\eta}\Big)^3 - 3V^{-1}\frac{d\mu}{d\eta}\frac{d^2\mu}{d\eta^2}\Big\}_i,$$

which is identical to the result by Cordeiro and Cordeiro (2001). Table 2 lists the distributions in the exponential family and the quantities required for the skewness.

Table 2: Expressions of $V$ and its derivatives for distributions in the exponential family.

Distribution      $V$             $V^{(1)}$     $V^{(2)}$
Normal            $1$             $0$           $0$
Poisson           $\mu$           $1$           $0$
Binomial          $\mu(1-\mu)$    $1-2\mu$      $-2$
Gamma             $\mu^2$         $2\mu$        $2$
Inver. Gaussian   $\mu^3$         $3\mu^2$      $6\mu$
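
For a GLM, the per-observation factor multiplying $m_{ai}^3$ in (5) can be written using only the variance function and the link derivatives. The sketch below evaluates $f - 4g - 3e = V^{-2}V^{(1)}(d\mu/d\eta)^3 - 3V^{-1}(d\mu/d\eta)(d^2\mu/d\eta^2)$, an expression obtained here by substituting $d_2 = -V^{-1}$ and $d_3 = 2V^{-2}V^{(1)}$ into the definitions of $e$, $f$ and $g$; the function name and the three test cases are illustrative assumptions.

```python
def glm_skewness_weight(v: float, v1: float, dmu: float, d2mu: float) -> float:
    """Return V^-2 V^(1) (dmu/deta)^3 - 3 V^-1 (dmu/deta) (d2mu/deta2) for one observation."""
    return v1 * dmu ** 3 / v ** 2 - 3.0 * dmu * d2mu / v


mu = 1.7
# Poisson with log link: V = mu, V^(1) = 1, dmu/deta = d2mu/deta2 = mu.
poisson_log = glm_skewness_weight(v=mu, v1=1.0, dmu=mu, d2mu=mu)
# Normal with identity link: V = 1, V^(1) = 0, dmu/deta = 1, d2mu/deta2 = 0.
normal_identity = glm_skewness_weight(v=1.0, v1=0.0, dmu=1.0, d2mu=0.0)
# Gamma with log link: V = mu^2, V^(1) = 2 mu, dmu/deta = d2mu/deta2 = mu.
gamma_log = glm_skewness_weight(v=mu ** 2, v1=2.0 * mu, dmu=mu, d2mu=mu)
```

The normal linear model gives a zero weight, consistent with the exact normality of the least squares estimator, while the Poisson and gamma log-linear models give the weights $-2\mu$ and $-1$.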

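We also calculate the skewness of the estimators of $\phi$ and $\sigma^2$ for two-parameter exponential family distributions with canonical parameters $\phi$ and $\phi\theta$. We have $a(\phi, y) = \phi c(y) + a_1(\phi) + a_2(y)$, where $c(\cdot)$ is a known function. We have $a^{(2)} = -na_1''(\phi)$, $a_{3,0} = na_1'''(\phi)$, $a^{(3)} = -na_1'''(\phi)$ and $a_{2,1} = 0$. From (6), we obtain

$$\kappa_3(\hat\phi) = -\frac{2a_1'''(\phi)}{n^2 a_1''(\phi)^3}$$

and the skewness becomes

$$\gamma_1(\hat\phi) = \frac{2a_1'''(\phi)}{\sqrt{n}\,\{-a_1''(\phi)\}^{3/2}}.$$

These expressions agree with the results by Cordeiro and Cordeiro (2001). We define $\xi(\sigma^2) = a_1(\sigma^{-2})$. From similar calculations, and using (8), the second-order third cumulant of $\hat\sigma^2$ can be expressed as

$$\kappa_3(\hat\sigma^2) = -\frac{2\sigma^4\{\sigma^2\xi'''(\sigma^2) + 3\xi''(\sigma^2)\}}{n^2\{2\xi'(\sigma^2) + \sigma^2\xi''(\sigma^2)\}^3},$$

which yields

$$\gamma_1(\hat\sigma^2) = \frac{2\sigma\{\sigma^2\xi'''(\sigma^2) + 3\xi''(\sigma^2)\}}{\sqrt{n}\,\{-2\xi'(\sigma^2) - \sigma^2\xi''(\sigma^2)\}^{3/2}}.$$
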
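Table 3 lists the skewness of the MLEs of the parameters $\phi$ and $\sigma^2$. The function $a_1(\phi)$ is equal to $\frac{1}{2}\log\phi$, $\phi\log(\phi) - \log\Gamma(\phi)$ and $\frac{1}{2}\log\phi$ for the normal, gamma and inverse Gaussian distributions, respectively. Here, $\Gamma(\cdot)$ is the gamma function and $\psi(\cdot)$ is the digamma function.

Table 3: Skewness of $\hat\phi$ and $\hat\sigma^2$.

Distribution      $\kappa_3(\hat\phi)$                                                     $\gamma_1(\hat\phi)$
Normal            $16\phi^3/n^2$                                                           $2^{5/2}/\sqrt{n}$
Gamma             $2\phi\{1+\phi^2\psi''(\phi)\}/(n^2\{1-\phi\psi'(\phi)\}^3)$             $-2\{\psi''(\phi)+\phi^{-2}\}/(\sqrt{n}\,\{\psi'(\phi)-\phi^{-1}\}^{3/2})$
Inver. Gaussian   $16\phi^3/n^2$                                                           $2^{5/2}/\sqrt{n}$

Distribution      $\kappa_3(\hat\sigma^2)$                                                                                     $\gamma_1(\hat\sigma^2)$
Normal            $8\sigma^6/n^2$                                                                                              $2^{3/2}/\sqrt{n}$
Gamma             $2\sigma^{12}\{2\sigma^4-3\sigma^2\psi'(\sigma^{-2})-\psi''(\sigma^{-2})\}/(n^2\{\sigma^2-\psi'(\sigma^{-2})\}^3)$   $2\{\psi''(\sigma^{-2})+3\sigma^2\psi'(\sigma^{-2})-2\sigma^4\}/(\sqrt{n}\,\{\psi'(\sigma^{-2})-\sigma^2\}^{3/2})$
Inver. Gaussian   $8\sigma^6/n^2$                                                                                              $2^{3/2}/\sqrt{n}$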
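
The skewness of $\hat\phi$ and $\hat\sigma^2$ in this setting depends on the model only through $a_1''(\phi)$ and $a_1'''(\phi)$. The sketch below is a hedged illustration: it assumes $a^{(2)} = -na_1''(\phi)$, $a_{3,0} = na_1'''(\phi)$, $a^{(3)} = -na_1'''(\phi)$ and $a_{2,1} = 0$ as stated in the text, and for $\hat\sigma^2$ it uses closed forms in terms of $a_1$ that were derived for this example by the change of variable $\sigma^2 = \phi^{-1}$. The function names are hypothetical. The normal case, $a_1(\phi) = \frac{1}{2}\log\phi$, serves as a check because $n\hat\sigma^2/\sigma^2$ is then chi-squared with $n$ degrees of freedom, whose skewness is $\sqrt{8/n}$.

```python
import math


def precision_skewness(a2: float, a3: float, n: int):
    """Third cumulant and skewness of phi_hat from a2 = a_1''(phi) < 0 and a3 = a_1'''(phi)."""
    kappa3 = -2.0 * a3 / (n ** 2 * a2 ** 3)
    gamma1 = 2.0 * a3 / (math.sqrt(n) * (-a2) ** 1.5)
    return kappa3, gamma1


def dispersion_skewness(a2: float, a3: float, phi: float, n: int):
    """Third cumulant and skewness of sigma2_hat = 1/phi_hat (derived form, see lead-in)."""
    sigma2 = 1.0 / phi
    core = 3.0 * sigma2 * a2 + a3
    kappa3 = 2.0 * sigma2 ** 6 * core / (n ** 2 * a2 ** 3)
    gamma1 = -2.0 * core / (math.sqrt(n) * (-a2) ** 1.5)
    return kappa3, gamma1


# Normal-type model: a_1(phi) = 0.5 * log(phi).
phi, n = 4.0, 20
a2 = -0.5 / phi ** 2
a3 = 1.0 / phi ** 3
kappa3_phi, gamma1_phi = precision_skewness(a2, a3, n)
kappa3_sigma2, gamma1_sigma2 = dispersion_skewness(a2, a3, phi, n)
```

For the gamma-type models the same two functions apply once $a_1''$ and $a_1'''$ are evaluated with the trigamma and tetragamma functions from a special-function library.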



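4.2 Exponential Family Nonlinear Models

We derive the skewness of the MLE $\hat\beta$ in EFNLMs. Under the parametrization $t\{y, \tau^{-1}(\mu)\} = y\tau^{-1}(\mu) - b\{\tau^{-1}(\mu)\}$, we have $d\tau^{-1}(\mu)/d\mu = V(\mu)^{-1}$, $d_2 = -V^{-1}$, $d_3 = 2V^{-2}V^{(1)}$, $W = \mathrm{diag}\{V^{-1}(d\mu/d\eta)^2\}$ and the model matrix is $\tilde{X}$. Thus, equation (5) reduces to

$$\kappa_3(\hat\beta_a) = \phi^{-2}\sum_{i=1}^{n} m_{ai}\Big[m_{ai}^2\Big\{V^{-2}V^{(1)}\Big(\frac{d\mu}{d\eta}\Big)^3 - 3V^{-1}\frac{d\mu}{d\eta}\frac{d^2\mu}{d\eta^2}\Big\} - 3V^{-1}\Big(\frac{d\mu}{d\eta}\Big)^2 n_{ai}\Big]_i,$$

where the matrix $N$ was defined in Section 2. The skewness of the MLEs of $\phi$ and $\sigma^2$ are equal to those of Section 4.1, since the nonlinearity does not affect these parameters. These results agree with those by Cavalcanti et al. (2009).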



4.3 Exponential Dispersion Models

The skewness of the MLEs in EDMs has not been investigated and equation (5) can be applied to several EDMs discussed in Jørgensen's (1997b) book, although the application of equation (8) is a much more difficult problem. For example, Jørgensen (1997b) discusses the Tweedie class of distributions with power variance function defined by $V(\mu) = \mu^\delta$. The cumulant generator function $b_\delta(\theta)$ for $\delta \neq 1, 2$ is

$$b_\delta(\theta) = (2 - \delta)^{-1}\{(1 - \delta)\theta\}^{(\delta-2)/(\delta-1)},$$

and $b_1(\theta) = \exp(\theta)$ and $b_2(\theta) = -\log(-\theta)$. We recognize for $\delta = 0, 2$ and $3$ the cumulant generator corresponding to the normal, gamma and inverse Gaussian distributions, respectively. There exist continuous EDMs generated by extreme stable distributions with support $\mathbb{R}$ and positive stable distributions for $\delta < 0$ and $\delta > 2$, respectively, and compound Poisson distributions for $1 < \delta < 2$. Setting $\alpha = (\delta - 2)/(\delta - 1)$, the function $a(\phi, y)$ for these two classes of models can be obtained from Jørgensen (1997b). For $\delta < 0$ ($y \in \mathbb{R}$), we have



a(0, y) = - log (Try) + l^g | "^^^^ 5){-yy ^-^'^^-''^ 

where 

r(l + J /a) f a ^ ^'Z" 



For $\delta > 2$ ($y > 0$), $a(\phi, y)$ is given by

{oo 
^ m(j, S)y~^(, 

where 

aT{l+ja) [(5-1^"^-' 



-0-,5) = -y^^3I^|^^j sin(-,vra). 

Our formulas do not depend on these complicated functions, which are used only to estimate $\phi$ for computing the skewness of the MLEs of $\beta$.

We also remark that there exists an exponential dispersion model with exponential variance function, $V(\mu) = e^{\mu}$; for more details see the book of Jørgensen (1997b).

Table 4 provides the basic quantities for the skewness in generalized hyperbolic secant (GHS) and negative binomial distributions, as well as for the skewness in the Tweedie distributions with power and exponential variance functions. These special cases have not been discussed in the literature so far. The GHS distribution is defined by taking $b(\theta) = -\log\{\cos(\theta)\}$, whereas the term $a(\phi, y)$ in (1) is given by

r 2{i-2</.)/</. ^ ^ ( y^ } 



Table 4: Expressions for $d_2$, its derivative $d_2'$ and $d_3$ for some EDMs.



Distribution 



do 



d' 



d?. 



(2At^+10At) 



GHS 

Neg. Bin. 

Power Var. 
Exp. Var. 



(/^2 + l)2 



(^2 



-1)3 

1 



(p+i) 



(At2 + 1)3 
2 



+ ^ 
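4.4 Proper Dispersion Models

For PDMs, equation (5) has no reduction, since the only difference between PDMs and DMs is the form of the function $a(\cdot,\cdot)$, which can be decomposed as $a(\phi, y) = a_1(\phi) + a_2(y)$. We now give the second-order third cumulant of $\hat\phi$ and $\hat\sigma^2$. For PDMs, $a^{(2)} = -na_1''(\phi)$, $a_{3,0} = na_1'''(\phi)$, $a^{(3)} = -na_1'''(\phi)$ and $a_{2,1} = 0$. Using (6), we have

$$\kappa_3(\hat\phi) = -\frac{2a_1'''(\phi)}{n^2 a_1''(\phi)^3} \quad\text{and}\quad \gamma_1(\hat\phi) = \frac{2a_1'''(\phi)}{\sqrt{n}\,\{-a_1''(\phi)\}^{3/2}}.$$

For $\hat\sigma^2$, we obtain

$$\kappa_3(\hat\sigma^2) = -\frac{2\sigma^4\{\sigma^2\xi'''(\sigma^2) + 3\xi''(\sigma^2)\}}{n^2\{2\xi'(\sigma^2) + \sigma^2\xi''(\sigma^2)\}^3}$$

and

$$\gamma_1(\hat\sigma^2) = \frac{2\sigma\{\sigma^2\xi'''(\sigma^2) + 3\xi''(\sigma^2)\}}{\sqrt{n}\,\{-2\xi'(\sigma^2) - \sigma^2\xi''(\sigma^2)\}^{3/2}},$$

where $\xi(\sigma^2) = a_1(\sigma^{-2})$. The form of $a(\phi, y)$ for this case is different from that for the two-parameter exponential family models, but the expressions for the third cumulant and skewness of $\hat\phi$ and $\hat\sigma^2$ are identical.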

We illustrate the idea on a particular example of PDM. We consider the von Mises regression model, which is quite useful for modeling circular data (see Mardia (1972) and Fisher (1993)). Here, the density function is given by

$$\pi(y; \mu, \phi) = \frac{1}{2\pi I_0(\phi)}\exp\{\phi\cos(y - \mu)\}, \tag{9}$$

where $-\pi < y \le \pi$, $-\pi < \mu \le \pi$, $\phi > 0$, and $I_\nu$ denotes the modified Bessel function of the first kind and order $\nu$ (see Abramowitz and Stegun, 1970, Eq. 9.6.1). The density (9) is symmetric around $y = \mu$, which is both the mode and the circular mean of the distribution. Here, $\phi$ is a precision parameter in the sense that when it increases, the density function (9) becomes more concentrated around $\mu$. Clearly, the density function (9) is a PDM, since $t(y, \mu) = \cos(y - \mu)$ and $a_1(\phi) = -\log\{I_0(\phi)\}$. We investigate the skewness of the estimate of $\beta$. We have $E\{\sin(Y - \mu)\} = 0$ and $E[\{\cos(Y - \mu)\}^2] = 1 - \phi^{-1}r(\phi)$, where $r(\phi) = I_1(\phi)/I_0(\phi)$. These results yield $d_2 = -r(\phi)$ and $d_3 = d_2' = 0$. The matrix $W$ is $W = \mathrm{diag}\{(d\mu/d\eta)^2 r(\phi)\}$ and we can obtain the inverse of the information matrix, and the matrices $M$ and $N$. Further, $f = (d\mu/d\eta)(d^2\mu/d\eta^2)r(\phi)$, $g = (d\mu/d\eta)(d^2\mu/d\eta^2)r(\phi)$ and $e = 0$. Hence, formula (5) yields

$$\kappa_3(\hat\beta_a) = -\frac{3r(\phi)}{\phi^2}\sum_{i=1}^{n} m_{ai}\Big\{m_{ai}^2\frac{d\mu}{d\eta}\frac{d^2\mu}{d\eta^2} + \Big(\frac{d\mu}{d\eta}\Big)^2 n_{ai}\Big\}_i.$$

If the link function is the identity function, i.e. $\eta = \mu$, then $w = r(\phi)$ and $f = g = e = 0$. For a linear von Mises regression model with identity link function, $\kappa_3(\hat\beta_a) = 0$. For a nonlinear model, we obtain

$$\kappa_3(\hat\beta_a) = -\frac{3r(\phi)}{\phi^2}\sum_{i=1}^{n} m_{ai}\, n_{ai}.$$

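For the skewness of the MLEs of $\phi$ and $\sigma^2$, we have $I_0'(\phi) = I_1(\phi)$ and $I_1'(\phi) = I_0(\phi) - I_1(\phi)/\phi$ (Abramowitz and Stegun, 1970; equations 9.6.26 and 9.6.27). Then, $a_1''(\phi) = -r'(\phi)$ and $a_1'''(\phi) = -r''(\phi)$, where $r(\phi) = I_1(\phi)/I_0(\phi)$ as before. Hence, we obtain for the von Mises model

$$\kappa_3(\hat\phi) = -\frac{2r''(\phi)}{n^2 r'(\phi)^3} \quad\text{and}\quad \gamma_1(\hat\phi) = -\frac{2r''(\phi)}{\sqrt{n}\,\{r'(\phi)\}^{3/2}}.$$

Now, some similar calculations yield

$$\kappa_3(\hat\sigma^2) = \frac{2\sigma^{12}\{3\sigma^2 r'(\sigma^{-2}) + r''(\sigma^{-2})\}}{n^2\{r'(\sigma^{-2})\}^3}$$

and then

$$\gamma_1(\hat\sigma^2) = \frac{2\{3\sigma^2 r'(\sigma^{-2}) + r''(\sigma^{-2})\}}{\sqrt{n}\,\{r'(\sigma^{-2})\}^{3/2}}.$$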
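
The von Mises quantities only require the ratio $r(\phi) = I_1(\phi)/I_0(\phi)$ and its first two derivatives. The following sketch is an illustration under stated assumptions: it takes $a_1(\phi) = -\log I_0(\phi)$ (the normalizing term of the von Mises density), so that $a_1'' = -r'$ and $a_1''' = -r''$, and it uses $r' = 1 - r/\phi - r^2$, which follows from the two Bessel derivative identities quoted in the text. The Bessel functions are evaluated by their power series; all function names are hypothetical.

```python
import math


def bessel_i(order: int, x: float, terms: int = 60) -> float:
    """Modified Bessel function of the first kind for integer order >= 0 (power series)."""
    return sum((x / 2.0) ** (2 * k + order) / (math.factorial(k) * math.factorial(k + order))
               for k in range(terms))


def resultant_derivatives(phi: float):
    """Return r(phi) = I_1/I_0 together with r'(phi) and r''(phi)."""
    r = bessel_i(1, phi) / bessel_i(0, phi)
    r1 = 1.0 - r / phi - r * r                      # from I_0' = I_1 and I_1' = I_0 - I_1/phi
    r2 = -r1 / phi + r / phi ** 2 - 2.0 * r * r1    # derivative of the previous line
    return r, r1, r2


def von_mises_precision_skewness(phi: float, n: int):
    """Third cumulant and skewness of phi_hat with a_1'' = -r' and a_1''' = -r''."""
    _, r1, r2 = resultant_derivatives(phi)
    kappa3 = -2.0 * r2 / (n ** 2 * r1 ** 3)
    gamma1 = -2.0 * r2 / (math.sqrt(n) * r1 ** 1.5)
    return kappa3, gamma1


r_value, r_prime, r_second = resultant_derivatives(2.0)
kappa3_phi, gamma1_phi = von_mises_precision_skewness(2.0, 50)
```

Since $r$ is increasing and concave on $\phi > 0$, the sketch returns a positive skewness for $\hat\phi$, which shrinks at the rate $n^{-1/2}$.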

Table 5 lists the quantities required for several PDMs, whereas Table 6 gives the skewness of the MLEs of $\phi$ and $\sigma^2$ for some PDMs.

Table 5: Expressions of $d_2$, its derivative $d_2'$ and $d_3$ in PDMs.

Distribution       $d_2$          $d_2'$         $d_3$
Rec. Gamma         $-\mu^{-2}$    $2\mu^{-3}$    $2\mu^{-3}$
Log-Gamma          $-1$           $0$            $1$
Rec. Inv. Gauss.
Von Mises          $-r(\phi)$     $0$            $0$

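Table 6: Skewness of $\hat\phi$ and $\hat\sigma^2$ for some PDMs.

Distribution       $\kappa_3(\hat\phi)$                                             $\gamma_1(\hat\phi)$
Rec. Gamma         $2\phi\{1+\phi^2\psi''(\phi)\}/(n^2\{1-\phi\psi'(\phi)\}^3)$     $-2\{\psi''(\phi)+\phi^{-2}\}/(\sqrt{n}\,\{\psi'(\phi)-\phi^{-1}\}^{3/2})$
Rec. Inv. Gauss.   $16\phi^3/n^2$                                                   $2^{5/2}/\sqrt{n}$
Log-Gamma          $2\phi\{1+\phi^2\psi''(\phi)\}/(n^2\{1-\phi\psi'(\phi)\}^3)$     $-2\{\psi''(\phi)+\phi^{-2}\}/(\sqrt{n}\,\{\psi'(\phi)-\phi^{-1}\}^{3/2})$
von Mises          $-2r''(\phi)/(n^2\{r'(\phi)\}^3)$                                $-2r''(\phi)/(\sqrt{n}\,\{r'(\phi)\}^{3/2})$

Distribution       $\kappa_3(\hat\sigma^2)$                                                                                     $\gamma_1(\hat\sigma^2)$
Rec. Gamma         $2\sigma^{12}\{2\sigma^4-3\sigma^2\psi'(\sigma^{-2})-\psi''(\sigma^{-2})\}/(n^2\{\sigma^2-\psi'(\sigma^{-2})\}^3)$   $2\{\psi''(\sigma^{-2})+3\sigma^2\psi'(\sigma^{-2})-2\sigma^4\}/(\sqrt{n}\,\{\psi'(\sigma^{-2})-\sigma^2\}^{3/2})$
Rec. Inv. Gauss.   $8\sigma^6/n^2$                                                                                              $2^{3/2}/\sqrt{n}$
Log-Gamma          $2\sigma^{12}\{2\sigma^4-3\sigma^2\psi'(\sigma^{-2})-\psi''(\sigma^{-2})\}/(n^2\{\sigma^2-\psi'(\sigma^{-2})\}^3)$   $2\{\psi''(\sigma^{-2})+3\sigma^2\psi'(\sigma^{-2})-2\sigma^4\}/(\sqrt{n}\,\{\psi'(\sigma^{-2})-\sigma^2\}^{3/2})$
von Mises          $2\sigma^{12}\{3\sigma^2 r'(\sigma^{-2})+r''(\sigma^{-2})\}/(n^2\{r'(\sigma^{-2})\}^3)$                      $2\{3\sigma^2 r'(\sigma^{-2})+r''(\sigma^{-2})\}/(\sqrt{n}\,\{r'(\sigma^{-2})\}^{3/2})$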



4.5 Some Other Special Submodels

We investigate some special cases which were first studied by Cordeiro (1985). If we take $t(y, \mu) = y\mu - b(\mu)$, (1) is a one-parameter exponential family indexed by the canonical parameter $\mu$. Now, we assume that $t(y, \mu)$ involves a known constant parameter $c$ for all observations, say $t(y, \mu) = t(y, \mu, c)$, and that $\phi = 1$ and $a(\phi, y) = a(c, y)$. Several models can be defined in this framework: normal $N(\mu, c^2\mu^2)$, log-normal $LN(\mu, c^2\mu^2)$ and inverse Gaussian $IG(\mu, c^2\mu^2)$ distributions with mean $\mu$ and known constant coefficient of variation $c$, and the Weibull $W(\mu, c)$ distribution with mean $\mu$ and known constant shape parameter $c$. Here, the normal and inverse Gaussian distributions are not standard GLMs since we consider a different parametrization.

For these models, we have $d_2 = -k_2\mu^{-2}$, $d_3 = k_3\mu^{-3}$ and $d_2' = 2k_2\mu^{-3}$, where $k_2$ and $k_3$ are known positive functions of $c$ (see Table 7). The matrix $W$ becomes $W = \mathrm{diag}\{k_2\mu^{-2}(d\mu/d\eta)^2\}$ and we can obtain the inverse of the information matrix and the matrices $M$ and $N$. Further, $w = k_2\mu^{-2}(d\mu/d\eta)^2$, $f = k_2\mu^{-2}(d\mu/d\eta)(d^2\mu/d\eta^2) - k_3\mu^{-3}(d\mu/d\eta)^3$, $g = k_2\mu^{-2}(d\mu/d\eta)(d^2\mu/d\eta^2)$ and $e = -2k_2\mu^{-3}(d\mu/d\eta)^3$. Then, equation (5) yields

$$\kappa_3(\hat\beta_a) = \sum_{i=1}^{n} m_{ai}\Big[m_{ai}^2\Big\{(6k_2 - k_3)\mu^{-3}\Big(\frac{d\mu}{d\eta}\Big)^3 - 3k_2\mu^{-2}\frac{d\mu}{d\eta}\frac{d^2\mu}{d\eta^2}\Big\} - 3k_2\mu^{-2}\Big(\frac{d\mu}{d\eta}\Big)^2 n_{ai}\Big]_i.$$

Table 7: Expressions of $k_2$ and $k_3$ for the normal, inverse Gaussian, log-normal and Weibull distributions.

Model                                      $k_2$                     $k_3$
Normal ($N(\mu, c^2\mu^2)$)                $c^{-2}(1 + 2c^2)$        $c^{-2}(6 + 10c^2)$
Inverse Gaussian ($IG(\mu, c^2\mu^2)$)     $(1/2)c^{-2}(1 + c^2)$    $c^{-2}(3 + \ldots$
Log-normal ($LN(\mu, c^2\mu^2)$)           $[\log(1 + c^2)]^{-1}$    $3[\log(1 + c^2)]^{-1}$
Weibull ($W(\mu, c)$)                      $c^2$                     $c^2(c + 3)$



5 Simulation results

We present some simulation results for the skewness of the finite-sample distributions of the MLEs of $\beta$, $\phi$ and $\sigma^2$. We use a reciprocal gamma model with square root link

$$\sqrt{\mu_i} = \beta_0 + \beta_1 x_{1i} + x_{2i}^{\beta_2}, \qquad i = 1, \ldots, n,$$

where the true values of the parameters were taken as $\beta_0 = 1/2$, $\beta_1 = 1$, $\beta_2 = 2$ and $\phi = 4$. The elements of the $n \times 3$ matrix $\tilde{X}$ are: $\tilde{x}_{i1} = 1$, $\tilde{x}_{i2} = x_{1i}$ and $\tilde{x}_{i3} = \log(x_{2i})\,x_{2i}^{\beta_2}$. The explanatory variables $x_1$ and $x_2$ were generated from the uniform $U(0, 1)$ and $U(1, 2)$ distributions, respectively, for $n = 20, 40$ and $60$. The values of $x_1$ and $x_2$ were held constant throughout the simulations. The number of Monte Carlo replications was set at 10,000 and all simulations were performed using the statistical software R.

In each of the 10,000 replications, we fitted the model and computed the MLEs $\hat\beta_0$, $\hat\beta_1$, $\hat\beta_2$, the fitted values $\hat\mu_1, \ldots, \hat\mu_n$, $\hat\phi$ and $\hat\sigma^2$. Then, we computed their estimated asymptotic skewness $\hat\gamma_1(\hat\beta_0)$, $\hat\gamma_1(\hat\beta_1)$, $\hat\gamma_1(\hat\beta_2)$, $\hat\gamma_1(\hat\phi)$ and $\hat\gamma_1(\hat\sigma^2)$, where each unknown value is replaced by its MLE, and their true asymptotic skewness $\gamma_1(\hat\beta_0)$, $\gamma_1(\hat\beta_1)$, $\gamma_1(\hat\beta_2)$, $\gamma_1(\hat\phi)$ and $\gamma_1(\hat\sigma^2)$. By true asymptotic skewness we mean the asymptotic skewness calculated by using the true values of the regression parameters. We then computed the sample skewness $g_3(\hat\beta_0)$, $g_3(\hat\beta_1)$, $g_3(\hat\beta_2)$, $g_3(\hat\phi)$ and $g_3(\hat\sigma^2)$, where $g_3(\cdot)$ is given by $g_3(\alpha) = m_3(\alpha)/m_2(\alpha)^{3/2}$ for a scalar $\alpha$, with $m_r(\alpha) = \frac{1}{10000}\sum_{j=1}^{10000}(\alpha_j - \bar\alpha)^r$ and $\bar\alpha = \frac{1}{10000}\sum_{j=1}^{10000}\alpha_j$.
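
The sample skewness $g_3 = m_3/m_2^{3/2}$ used to summarize the Monte Carlo replications can be computed as in the sketch below. The simulations in the paper were run in R; this Python function, its name and the toy inputs are only an illustration of the definition, with central moments taken with divisor equal to the number of replications.

```python
def sample_skewness(values):
    """Return g_3 = m_3 / m_2**1.5, with m_r the r-th sample central moment (divisor N)."""
    count = len(values)
    if count < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / count
    m2 = sum((v - mean) ** 2 for v in values) / count
    m3 = sum((v - mean) ** 3 for v in values) / count
    if m2 == 0.0:
        raise ValueError("all values are equal, so the skewness is undefined")
    return m3 / m2 ** 1.5


symmetric = sample_skewness([-1.0, 0.0, 1.0])
right_skewed = sample_skewness([0.0, 0.0, 0.0, 4.0])
```

In a simulation study the argument would be the vector of 10,000 replicated estimates of one parameter, and the result would be compared with the analytical skewness.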

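Table 8 gives the sample means of the estimated skewness $\hat\gamma_1(\hat\beta_0)$, $\hat\gamma_1(\hat\beta_1)$ and $\hat\gamma_1(\hat\beta_2)$, the true skewness $\gamma_1(\hat\beta_0)$, $\gamma_1(\hat\beta_1)$ and $\gamma_1(\hat\beta_2)$, and the sample skewness $g_3(\hat\beta_0)$, $g_3(\hat\beta_1)$ and $g_3(\hat\beta_2)$.

Table 8: Estimated, true and sample skewness of $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$.

$n$   $\hat\gamma_1(\hat\beta_0)$  $\gamma_1(\hat\beta_0)$  $g_3(\hat\beta_0)$  $\hat\gamma_1(\hat\beta_1)$  $\gamma_1(\hat\beta_1)$  $g_3(\hat\beta_1)$  $\hat\gamma_1(\hat\beta_2)$  $\gamma_1(\hat\beta_2)$  $g_3(\hat\beta_2)$
20    -0.1132   -0.1355   -0.6519   -0.0454   -0.0753   -0.1408   1.2430   2.4222   7.9249
40    -0.0989   -0.1121   -0.3233   -0.0365   -0.0577   -0.1282   0.9846   1.6590   3.5451
60    -0.0545   -0.0870   -0.1684   -0.0184   -0.0405   -0.0997   0.5673   1.1399   2.1091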



The figures in Table 8 show that the sample and analytical skewness decrease as the sample size increases, in agreement with the first-order asymptotic theory. We also note that $\hat\beta_0$ and $\hat\beta_1$ are always negatively skewed, whereas $\hat\beta_2$ is always positively skewed. In most of the cases, the sample skewness is larger, in absolute value, than the estimated asymptotic skewness. We note that the estimated and true asymptotic skewness are not far apart. We observe large differences between the sample skewness and the estimated asymptotic skewness for $n = 20$. The explanation for such behavior is that the expected value of $m_r$ is equal to the $r$th central moment of the population if we neglect terms of order $n^{-1/2}$. These terms, however, are not negligible for small sample sizes.

Table 9 gives the sample mean of the estimated asymptotic skewness $\hat\gamma_1(\hat\phi)$ and $\hat\gamma_1(\hat\sigma^2)$ out of 10,000 values, the true asymptotic skewness $\gamma_1(\hat\phi)$ and $\gamma_1(\hat\sigma^2)$, and the sample skewness $g_3(\hat\phi)$ and $g_3(\hat\sigma^2)$.



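Table 9: Estimated, true and sample skewness of $\hat\phi$ and $\hat\sigma^2$.

$n$   $\hat\gamma_1(\hat\phi)$  $\gamma_1(\hat\phi)$  $g_3(\hat\phi)$  $\hat\gamma_1(\hat\sigma^2)$  $\gamma_1(\hat\sigma^2)$  $g_3(\hat\sigma^2)$
20    0.8732   1.0911   1.8858   0.3154   0.4878   0.6998
40    0.6493   0.7809   1.0072   0.2682   0.3449   0.4364
60    0.3927   0.5443   0.8003   0.2319   0.2816   0.2922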



From the figures in Table 9, it is clear that the asymptotic normality of $\hat\phi$ and $\hat\sigma^2$, often used in DMs, is not achieved for small values of $n$. The results in this table suggest that there is a quite reasonable agreement between the analytical and the sample skewness. The estimated and true skewness are quite close even for small values of $n$.

6 Conclusion

In this article, we introduce the dispersion models (DMs) with a regression systematic component to extend the well-known generalized linear models (GLMs), the exponential family nonlinear models (EFNLMs) (Cordeiro and Paula, 1989) and the class of proper dispersion models (PDMs) (Jørgensen, 1997a). Several properties of distributions in the class of DMs are discussed in the excellent book of Jørgensen (1997b). For the first time, we derive the second-order skewness of the MLEs of the regression parameters in DMs using formulae obtained by Bowman and Shenton (1998).

We obtain an explicit matrix expression for the skewness of the maximum likelihood estimate (MLE) of the regression parameter vector $\beta$. We also derive the skewness of the MLEs of the precision and dispersion parameters. Our results generalize those obtained by Cordeiro and Cordeiro (2001) and Cavalcanti et al. (2009), and also provide new results for some special submodels such as the exponential dispersion models and PDMs. In particular, we discuss results for the von Mises regression model. We perform a simulation study in a nonlinear reciprocal gamma model, which indicates that the normal approximation usually employed with MLEs in DMs can be misleading in samples of small to moderate size.